1. INTRODUCTION

Malaysia is one of the world's leading palm oil producers [1]. As palm oil production capacity
increases every year, a large amount of wastewater is also generated. Uncontrolled discharge of
untreated palm oil mill effluent (POME) may pollute waterways [2]. Compared with the conventional
activated sludge system, a membrane system is preferable for treating POME because of its simple
operation, ease of scale-up, lower weight and space requirements, and high efficiency [3]. The membrane
bioreactor (MBR) has been proven to be a reliable technology for treating a wide range of waters such as
wastewater, groundwater and surface water. However, fouling is the main drawback of MBR systems and
contributes to high energy consumption and maintenance cost [4]. According to [5-7], fouling may vary
with time during operation, and this variation can be minimized by controlling the fouling
variables [8].

Fouling can be controlled and reduced using several hydrodynamic techniques such as air bubble
(aeration) control, relaxation, backwashing and chemical cleaning [8-10]. However, little work has been
conducted on modeling and optimizing the operating conditions for POME treatment using MBR. Most
studies have focused mainly on the biological reduction of POME using MBR filtration [9-12]. Modeling a
membrane process, which involves a large number of parameters, is not an easy task.

Recently, modeling of membrane processes using artificial neural networks (ANNs) has received
considerable attention because of their ability to model and predict complex processes. ANNs have been
successfully applied to predict oily wastewater [13-14], permeate flux of bovine serum albumin [15] and
palm oil mill wastewater [16-17]. In addition, a good understanding of the factors that affect ANN model
performance is crucial for choosing the number of iterations, learning rate, momentum coefficient,
number of hidden layers and number of hidden neurons. The parameters are varied until their optimal
values are determined [18]. Determining the best ANN topology is important because it affects the
weights and biases. It is usually performed by trial and error [19-20] or one-variable-at-a-time (OVAT)
[21-22], both of which are very time-consuming and monotonous. According to [23], for three different
levels of each of five ANN variables, about 243 (= 3^5) different ANN configurations would be required.
There is no specific rule for selecting the values of the ANN variables; they depend on the complexity
of the modeled system. Thus, it is important for researchers to find a standard technique to solve the
problems associated with ANN development.

Response surface methodology (RSM), a collection of statistical and mathematical techniques, is
capable of optimizing objective functions. It is a powerful design tool in many engineering
applications and can provide accurate models. RSM has been used to determine the ANN topology of
multi-layer feed-forward neural networks with backpropagation [23-24]. It has also been used to find the
optimum number of neurons in the first and second hidden layers [18]. This paper aims to develop
radial basis function neural network (RBFNN) models for predicting permeate flux during MBR
filtration of POME wastewater. RSM is proposed to find the optimum ANN topology that minimizes the
mean square error (MSE) and thereby improves the performance of the model.


2. RESEARCH METHOD 
2.1. Data collection 

The experiments were carried out using a membrane bioreactor for palm oil mill effluent (POME)
with a working volume of 20 L. The POME sample was taken from Sedenak Palm Oil Mill Sdn. Bhd. in
Johor, Malaysia, and the working temperature was 27 ± 1 °C. The POME model has four input variables:
transmembrane pressure (TMP), airflow rate, permeate pump and aeration pump. The output variable is
permeate flux. The data were analyzed using MATLAB R2014a and Design Expert version 7.1.6 to obtain
the response surface and contour plots. A total of 1602 data points were collected from the experiment
for each parameter: airflow rate, TMP, permeate pump, aeration pump and permeate flux. Figure 1 shows
that the flux decreased rapidly after the airflow rate was reduced from 8 SLPM to 5 SLPM.
Figure 1. Data from MBR filtration experiment

2.2. Model development
In this work, the RBFNN model was used to predict the permeate flux of the POME membrane
bioreactor. Beforehand, all data underwent a pre-processing stage called normalization. Since the
input data for this system have different magnitudes and scales, all data were normalized to a
minimum of 0 and a maximum of 1. This procedure prevents the transfer function from becoming
saturated [25]. The normalization is given by (1):
$$X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \qquad (1)$$


where $X'$ is the scaled value, $X$ is the sample value, and $X_{\min}$ and $X_{\max}$ are the minimum and
maximum values of the data. The permeate flux was determined as given in (2):


$$J = \frac{V}{A\,t} \qquad (2)$$
where J is the permeate flux (L m^-2 h^-1), V is the permeate volume (L), A is the membrane surface
area (m^2) and t is the time (h). To investigate the feasibility of the predictive model, the collected
data were separated into three data sets. Of the total, 651 points formed the training set, which
included the transition between high and low airflow rate. The 500-point testing set was taken from the
high airflow rate period, and the 451-point validation set was taken from the low airflow rate period.
The training data were used to compute the network parameters. The testing data were used to assess
the predictive ability of the generated model, while the remaining validation data were subsequently
used to ensure the robustness of the network parameters and to avoid over-training [26]. The training
set must be equal to or larger than the testing and validation sets to avoid extrapolation problems [27].
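The pre-processing in (1) and the flux definition in (2) can be illustrated with a short Python sketch. It is not the authors' MATLAB code; the function names and all sample values are hypothetical, and only the 8 and 5 SLPM airflow levels are taken from the text.

```python
def min_max_normalize(values):
    """Scale a sequence to the range [0, 1], following equation (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant column cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in values]


def permeate_flux(volume_l, area_m2, time_h):
    """Permeate flux J = V / (A * t) in L m^-2 h^-1, following equation (2)."""
    return volume_l / (area_m2 * time_h)


# Hypothetical airflow readings in SLPM (intermediate value invented for illustration).
airflow = [8.0, 8.0, 6.5, 5.0, 5.0]
scaled = min_max_normalize(airflow)

# Hypothetical measurement: 0.5 L of permeate through 0.25 m^2 of membrane in 2 h.
flux = permeate_flux(0.5, 0.25, 2.0)
```

Each input and the output would be scaled column by column in this way before training, so that variables with large magnitudes do not dominate the network inputs.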

In this paper, a three-layer RBFNN with input, hidden and output layers was used. The non-linear
hyperbolic tangent sigmoid transfer function was used in the hidden layer, and the linear transfer
function (purelin) was chosen for the output layer to produce a continuous output. RSM was used to find
the optimal value of each learning parameter of the RBFNN model. A central composite design (CCD) of
50 experiments for five numerical factors (number of neurons, spread, learning rate, momentum rate and
number of epochs) with eight replicates at the center point was used. The five numerical factors and
their simulation ranges for the RBFNN are shown in Table 1. The experimental results of the CCD were
fitted with a second-order polynomial equation by multiple regression. For predicting the optimal
point, the quadratic model is expressed by (3):


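$$Y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \beta_{ij} x_i x_j + \varepsilon_i \qquad (3)$$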


where Y is the response ln(MSE); $\beta_0$, $\beta_i$, $\beta_{ii}$ and $\beta_{ij}$ are the regression coefficients for
the intercept, linear, quadratic and interaction terms, respectively; $x_i$ and $x_j$ are the independent
variables; and k is the number of factors.
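The run count of the design and the size of the quadratic model can be checked with a small sketch in coded units. It assumes a full two-level factorial core and axial points at alpha = 1 (a face-centred design); neither choice is stated in the text, although a full factorial core is the one that reproduces the 50 runs reported. The function names are hypothetical.

```python
from itertools import combinations, product


def central_composite_design(k, n_center, alpha=1.0):
    """Return CCD runs in coded units: factorial, axial and center points.

    alpha=1.0 (face-centred axial points) is an assumption, not from the paper.
    """
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for level in (-alpha, alpha):
            point = [0.0] * k
            point[i] = level
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


def quadratic_terms(x):
    """Regressors of the second-order model (3): 1, x_i, x_i^2, x_i * x_j."""
    k = len(x)
    squares = [xi * xi for xi in x]
    interactions = [x[i] * x[j] for i, j in combinations(range(k), 2)]
    return [1.0] + list(x) + squares + interactions


design = central_composite_design(k=5, n_center=8)
n_runs = len(design)                              # 32 factorial + 10 axial + 8 center
n_coefficients = len(quadratic_terms(design[0]))  # 1 + 5 + 5 + 10
```

With five factors the quadratic model has 21 coefficients, that is, an intercept plus 20 model terms, which is far fewer network trainings than a full three-level factorial over the same factors would need.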


Table 1. The range of training parameters

RBFNN parameter           Low     High
x1: No. of neurons        1       20
x2: Spread                0.1     2
x3: Learning rate         0.01    0.4
x4: Momentum rate         0.01    0.9
x5: Number of epochs      10      3000
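The ranges in Table 1 can be mapped to and from coded levels with a short sketch. It assumes the usual RSM convention that the low and high values correspond to coded levels -1 and +1, which the text does not state explicitly; the dictionary keys and function names are hypothetical.

```python
# Low and high values from Table 1; the key names are illustrative only.
RANGES = {
    "neurons": (1, 20),
    "spread": (0.1, 2.0),
    "learning_rate": (0.01, 0.4),
    "momentum": (0.01, 0.9),
    "epochs": (10, 3000),
}


def to_actual(coded, low, high):
    """Convert a coded level (assumed -1 = low, +1 = high) to an actual value."""
    centre = (high + low) / 2
    half_range = (high - low) / 2
    return centre + coded * half_range


def to_coded(actual, low, high):
    """Convert an actual value back to its coded level."""
    centre = (high + low) / 2
    half_range = (high - low) / 2
    return (actual - centre) / half_range


# Actual settings at the design center (coded level 0 for every factor).
centre_point = {name: to_actual(0.0, lo, hi) for name, (lo, hi) in RANGES.items()}
```

Integer-valued factors such as the number of neurons and epochs would need rounding before training; for example, the center level of 10.5 neurons is not itself a valid network size.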


All ANN topologies were designed using RSM and then trained. The resulting quadratic equation was
solved using the response optimizer of RSM until the optimum condition that minimizes the MSE (the
response variable) was found. The MSE values were transformed with the natural logarithm, ln(MSE), with
a equal to 1. In this case, the distribution of the response variable becomes closer to the normal
distribution [24].


2.3. Performance evaluation 


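Model performance was evaluated using the mean square error (MSE), root mean square error (RMSE) and
coefficient of determination (R²), given in (4)-(6):

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (x_{pi} - x_{ai})^2 \qquad (4)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_{pi} - x_{ai})^2} \qquad (5)$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (x_{pi} - x_{ai})^2}{\sum_{i=1}^{N} (x_{pi} - \bar{x})^2} \qquad (6)$$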
where $x_{pi}$ is the predicted output for observation i, $x_{ai}$ is the experimental (actual) output for
observation i, $\bar{x}$ is the average value of the experimental output and N is the number of data points.
Smaller values of MSE and RMSE indicate better model performance. An R² equal to 1 means that the
regression line fits the data perfectly [26].
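As a minimal sketch of the two error measures, the following Python functions compute MSE and RMSE from paired predicted and actual values. The flux values are invented for illustration and are not experimental data; R² is left out here to keep the example short.

```python
import math


def mse(predicted, actual):
    """Mean square error over paired predicted and actual outputs."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean square error, the square root of the MSE."""
    return math.sqrt(mse(predicted, actual))


# Hypothetical normalized flux values, for illustration only.
predicted = [0.8, 0.7, 0.6]
actual = [0.9, 0.7, 0.5]

example_mse = mse(predicted, actual)
example_rmse = rmse(predicted, actual)
```

Because RMSE is the square root of MSE, it is expressed in the same units as the (normalized) flux, which makes it easier to compare with the magnitude of the data.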


3. RESULTS AND DISCUSSION 

The relationship between the response ln(MSE) and the independent parameters, namely number of
neurons (x1), spread (x2), learning rate (x3), momentum coefficient (x4) and number of epochs (x5), is
given as follows:


$$\begin{aligned}
\ln(\mathrm{MSE}) = {} & -5.76 - 1.18x_1 - 0.68x_2 - 0.075x_3 - 0.041x_4 - 0.15x_5 + 0.18x_1x_2 \\
& + 0.077x_1x_3 + 0.043x_1x_4 + 0.15x_1x_5 - 0.051x_2x_3 - 0.043x_2x_4 - 0.12x_2x_5 \\
& - 0.043x_3x_4 + 0.040x_3x_5 + 0.041x_4x_5 + 0.81x_1^2 + 0.44x_2^2 + 0.010x_3^2 - 0.024x_4^2 \\
& + 0.081x_5^2
\end{aligned} \qquad (7)$$
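To show how the fitted model (7) can be used, the sketch below evaluates it and searches a coarse grid for its minimum. It assumes the coefficients apply to coded variables in [-1, 1] and that the terms appear in the order x1 to x5, as read from the equation. The grid search is only a stand-in for the response optimizer of the RSM software, so its result need not match the reported optimum.

```python
from itertools import product

# Coefficients read from equation (7); coded units in [-1, 1] are assumed.
INTERCEPT = -5.76
LINEAR = [-1.18, -0.68, -0.075, -0.041, -0.15]
SQUARES = [0.81, 0.44, 0.010, -0.024, 0.081]
INTERACTIONS = {
    (0, 1): 0.18, (0, 2): 0.077, (0, 3): 0.043, (0, 4): 0.15,
    (1, 2): -0.051, (1, 3): -0.043, (1, 4): -0.12,
    (2, 3): -0.043, (2, 4): 0.040, (3, 4): 0.041,
}


def ln_mse(x):
    """Predicted ln(MSE) at a coded point x = (x1, ..., x5)."""
    value = INTERCEPT
    value += sum(b * xi for b, xi in zip(LINEAR, x))
    value += sum(b * xi * xi for b, xi in zip(SQUARES, x))
    value += sum(b * x[i] * x[j] for (i, j), b in INTERACTIONS.items())
    return value


# Coarse grid over the coded design space: -1, -0.75, ..., 1 for each factor.
grid = [i / 4 - 1 for i in range(9)]
best_point = min(product(grid, repeat=5), key=ln_mse)
best_value = ln_mse(best_point)
```

At the design center every coded variable is zero, so the prediction reduces to the intercept; the grid minimum gives a rough idea of where the optimizer would look before refining the solution.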


The fitness of the model was assessed by analysis of variance (ANOVA), which consists of the sum of
squares (SS), degrees of freedom (df), mean square (MS), F-values and P-values, as shown in Table 2.
The significance of each coefficient was determined by the F-test and P-value. A variable becomes more
significant as its absolute F-value becomes greater and its P-value becomes smaller. From Table 2, the
model gives an F-value of 81.25 and a very low P-value (< 0.0001). P-values < 0.05 indicate that the
model terms are significant. The number of neurons had the largest effect on the ln(MSE) response,
followed by spread and number of epochs. The learning rate and momentum coefficient had no significant
effect on the response. The predicted R² of 0.9261 is in reasonable agreement with the adjusted R² of
0.9704. The low coefficient of variation (CV = 4.62%), which is less than 10%, shows that the
experiments conducted were precise and reliable.


Table 2. ANOVA for predicted RSM model

Source                       SS       df    MS       F-value    P-value (Prob > F)
Model                        84.45    20    4.22     81.25      < 0.0001             Significant
x1 - Number of neurons       44.62    1     44.62    858.64     < 0.0001
x2 - Spread                  14.74    1     14.74    283.53     < 0.0001
x3 - Learning rate           0.18     1     0.18     3.50       0.0714
x4 - Momentum coefficient    0.055    1     0.055    1.05       0.3137
x5 - Number of epochs        0.69     1     0.69     13.21      0.0011
Residual                     1.51     29    0.052
Lack of fit                  1.51     21    0.072
Pure error                   0.000    8     0.000
Cor total                    85.96    49

Model statistics
Std. Dev.    0.23     R-Squared         0.9825
Mean         -4.93    Adj R-Squared     0.9704
C.V. %       4.62     Pred R-Squared    0.9261
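The entries of Table 2 are linked by simple arithmetic, which the following sketch reproduces from the tabulated sums of squares and degrees of freedom. Because the table values are rounded, the recomputed figures agree with the printed ones only approximately; the function name is hypothetical.

```python
def f_ratio(sum_squares, df, residual_ss, residual_df):
    """F-value of a term: its mean square divided by the residual mean square."""
    mean_square = sum_squares / df
    residual_ms = residual_ss / residual_df
    return mean_square / residual_ms


# Rounded values taken from Table 2.
RESIDUAL_SS, RESIDUAL_DF = 1.51, 29
TOTAL_SS, TOTAL_DF = 85.96, 49

f_model = f_ratio(84.45, 20, RESIDUAL_SS, RESIDUAL_DF)
f_neurons = f_ratio(44.62, 1, RESIDUAL_SS, RESIDUAL_DF)

# R-squared and adjusted R-squared from the same sums of squares.
r_squared = 1 - RESIDUAL_SS / TOTAL_SS
adj_r_squared = 1 - (RESIDUAL_SS / RESIDUAL_DF) / (TOTAL_SS / TOTAL_DF)
```

The recomputed F-values are close to 81 for the model and 857 for the number of neurons, in line with the 81.25 and 858.64 reported, and the two R² values round to roughly 0.982 and 0.970.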


3.1. Response surface plot results 

The response surface plots are presented in Figure 2. Each graph represents a combination of two
factors at a time, with all other factors held at the middle level. Figure 2(a) shows the response
surface of ln(MSE) versus the number of neurons and spread while the other factors remained constant at
the zero level. It can be seen from Figure 2(a) that the minimum value of ln(MSE) is found at 15-20
neurons and a spread of 1.5-2.0. The same range of neuron number was observed in relation to epoch
number, as shown in Figure 2(b). The response surface in Figure 2(c) shows that as the number of epochs
increased from 10 to 3000 and the spread from 0.1 to 2.0, ln(MSE) decreased to -6.2. The objective
function value of ln(MSE) is -6.342 at the final point, as presented in Figure 2(a). The optimum values
given by RSM were as follows: number of neurons = 16, spread = 1.4, learning rate = 0.28, momentum
rate = 0.3 and number of epochs = 1852. These optimum values were used in the training of the RBFNN.
Figure 2. Response surface and contour plot of ln(MSE) for (a) neuron number and spread, (b) neuron
number and epoch number, and (c) spread and epoch number while other factors remained constant.


Optimization of artificial neural network topology for membrane bioreactor... (Syahira Ibrahim) 


122 0 ISSN: 2252-8938 


3.2. Neural network plot results 

In this section, the regression plots of the experimental data versus the computed neural network
data using the optimum ANN topology are presented for each step, including the training, testing and
validation networks. The predicted models fitted the experimental data well for all steps, as depicted
in Figure 3. The correlation coefficient (R) is 0.9906 for training, 0.9839 for testing and 0.9707 for
validation. The comparative values of the coefficient of determination (R²), RMSE and MSE are given in
Table 3. The results show that the optimum ANN model is suitable for describing the permeate flux of
POME using MBR filtration. The optimal topology of ANN using RSM provided good quality prediction for
the five exogenous outputs. The results were compared with the conventional RBFNN and show improved
ANN model performance, as shown in Table 3. The RBFNN-RSM approach performed better and was faster
than trial-and-error methods in finding the optimum topology of ANNs.
Figure 3. Regression plot for predicted versus experimental flux for (a) training data, (b) testing data and (c)
validation data


Table 3. Performance evaluation

RBFNN-RSM             R²        MSE       RMSE
Training              0.9813    0.0022    0.0470
Testing               0.9681    0.0052    0.0722
Validation            0.9423    0.0217    0.1473

Conventional RBFNN    R²        MSE       RMSE
Training              0.9422    0.0076    0.0872
Testing               0.9374    0.0096    0.0980
Validation            0.7956    0.0200    0.1414
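The figures in Table 3 can be cross-checked with a few lines of Python: each RMSE should equal the square root of the corresponding MSE, and for the RBFNN-RSM model each R² should equal the square of the correlation coefficient R quoted in the text. The tolerances are assumptions chosen to allow for four-decimal rounding.

```python
import math

# (R^2, MSE, RMSE) as printed in Table 3.
TABLE_3 = {
    ("RBFNN-RSM", "training"): (0.9813, 0.0022, 0.0470),
    ("RBFNN-RSM", "testing"): (0.9681, 0.0052, 0.0722),
    ("RBFNN-RSM", "validation"): (0.9423, 0.0217, 0.1473),
    ("Conventional RBFNN", "training"): (0.9422, 0.0076, 0.0872),
    ("Conventional RBFNN", "testing"): (0.9374, 0.0096, 0.0980),
    ("Conventional RBFNN", "validation"): (0.7956, 0.0200, 0.1414),
}

# Correlation coefficients R for the RBFNN-RSM model, as quoted in the text.
R_VALUES = {"training": 0.9906, "testing": 0.9839, "validation": 0.9707}

# RMSE should be sqrt(MSE) up to rounding of the printed values.
rmse_consistent = all(
    abs(math.sqrt(mse) - rmse) <= 5e-4 for _, mse, rmse in TABLE_3.values()
)

# R^2 should be the square of R up to rounding of the printed values.
r2_consistent = all(
    abs(R_VALUES[step] ** 2 - TABLE_3[("RBFNN-RSM", step)][0]) <= 1e-4
    for step in R_VALUES
)
```

Both checks pass for the printed values, which suggests that the table and the correlation coefficients in the text are mutually consistent.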
Figures 4(a)-(c) show the permeate flux response for training, testing and validation,
respectively. For the training data, which cover the transition from high to low airflow rate, the
permeate flux decreases slowly from 0.88 to 0.60 L/m² h. For the testing data, the permeate flux is at
the high airflow rate and remains at 0.8 L/m² h. For the validation data, the permeate flux decreases
rapidly compared with the permeate flux at high airflow. Good prediction models were obtained for the
permeate flux for all data sets.
Figure 4. POME permeate flux for (a) training data, (b) testing data and (c) validation data set.

4. CONCLUSION

The following conclusions can be drawn from the investigations conducted in this work. The optimal
RBFNN topology was more precise for predicting the permeate flux of POME using MBR, with a low MSE
(0.0022) and a high correlation coefficient (0.9906). The optimal neural model had the minimum MSE
when the number of neurons, spread, learning rate, momentum rate and number of epochs were equal to
16, 1.4, 0.28, 0.3 and 1852, respectively. The results of testing and validating the model on new
trials showed excellent agreement between the actual and predicted data, with correlation coefficients
equal to 0.9839 and 0.9707, respectively. The integration of RBFNN and RSM reduced the computational
cost and improved the ANN model prediction.